Art Unit: 2166 

DETAILED ACTION 



1. Claims 1-20 are pending in this office action. 



Information Disclosure Statement 



2. Applicants' Information Disclosure Statements, filed on 3/24/2004 and 8/10/2004, 
have been received, entered and considered. See attached form PTO-1449. 



Specification 



3. The cross-reference to related applications section given in the specification needs 
to be updated by including the U.S. Patent Application Serial number. 

The following is a quote in part of MPEP 608.01(p), concerning the incorporation 
of subject matter by reference: 

"The Commissioner has considerable discretion in determining what may or may 
not be incorporated by reference in a patent application. General Electric Co. v. 
Brenner, 407 F.2d 1258, 159 USPQ 335 (D.C. Cir. 1968). The incorporation by 
reference practice with reference to applications which issue as U.S. patents provides 
the public with a patent disclosure which minimizes the public's burden to search for and 
obtain copies of documents incorporated by reference which may not be readily 
available. Through the Office's incorporation by reference policy the Office ensures that 
reasonably complete disclosures are published as U.S. patents. The following is the 
manner in which the Commissioner has elected to exercise that discretion. 

An application as filed must be complete in itself in order to comply with 35 
U.S.C. 112. Material nevertheless may be incorporated by reference, Ex parte 
Schwarze, 151 USPQ 426 (Bd. App. 1966). An application for a patent when filed may 
incorporate "essential material" by reference to (1) a U.S. patent or (2) a pending U.S. 
application, subject to the conditions set forth below. 

"Essential material" is defined as that which is necessary to (1) describe the 
claimed invention, (2) provide an enabling disclosure of the claimed invention, or (3) 
describe the best mode (35 U.S.C. 112). In any application which is to issue as a U.S. 
patent, essential material may not be incorporated by reference to (1) patents or 
applications published by foreign countries or a regional patent office, (2) non-patent 
publications, (3) a U.S. patent or application which itself incorporates "essential 
material" by reference, or (4) a foreign application. 

Nonessential subject matter may be incorporated by reference to (1) patents or 
applications published by the United States or foreign countries or regional patent 
offices, (2) prior filed, commonly owned U.S. applications, or (3) non-patent 
publications. Nonessential subject matter is subject matter referred to for purposes of 
indicating the background of the invention or illustrating the state of the art. 

Mere reference to another application, patent, or publication is not an 
incorporation of anything therein into the application containing such reference for the 
purpose of the disclosure required by 35 U.S.C. 112, first paragraph. In re de Seversky, 
474 F.2d 671, 177 USPQ 144, (CCPA 1973). In addition to other requirements for an 
application, the referencing application should include an identification of the referenced 
patent, application, or publication. Particular attention should be directed to specific 
portions of the referenced document where the subject matter being incorporated may 
be found. Guidelines for situations where applicant is permitted to fill in a number for 
Serial No. left blank in the application as filed can be found in In re 
Fouche, 439 F.2d 1237, 169 USPQ 429 (CCPA 1971) (Abandoned applications less 
than 20 years old can be incorporated by reference to same extent as copending 
applications; both types are open to public upon referencing application issuing as a 
patent)." 



4. The specification is objected to as failing to provide proper antecedent basis for 
the claimed subject matter. See 37 CFR 1.75(d)(1) and MPEP § 608.01(o). Correction 
of the following is required: Claims 1, 5-8, 11, 15-18, and 20 recite first, 
second, third and fourth percentages. The specification does not describe which one of 
the percentages is the first, second, third or fourth. 



Claim Rejections - 35 USC § 101 

5. Claims 11-19 are rejected under 35 U.S.C. 101 as being directed to non-statutory 
subject matter. The language of the claims raises a question as to whether the claims 
are directed merely to an environment or machine which would result in a practical 
application producing a concrete, useful, and tangible result to form the basis of statutory 
subject matter under 35 U.S.C. 101. 

Claims 11-19 are rejected because the system recited in these claims requires 
hardware components, such as a memory or processor, to indicate how its 
functionality is realized and to provide any tangible result. It is also unclear 
what applicant is referring to in the specification when reciting means for receiving, 
computing and using in the claims. 

To expedite a complete examination of the instant application, the claims rejected 
under 35 U.S.C. 101 (nonstatutory) above are further rejected as set forth below in 
anticipation of applicant amending these claims to place them within the four 
categories of invention. 

Claim Rejections - 35 USC § 102 

6. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

Claims 1-2, 4-5, 10-12, 14-15, and 20 are rejected under 35 U.S.C. 102(b) as 
being anticipated by Steven J. Simske (Simske hereinafter) (U.S. PG Pub No. 
2004/0133560). 

With respect to claim 1, Simske teaches a method for computing a measure 
of similarity between a first (or input) document and a second (or search results) 
document, comprising: 

"(a) receiving a first list of rated keywords extracted from the first 
document and a second list of rated keywords extracted from the second 
document" as organizing electronic documents may include generating a list of 
weighted keywords for each document (Simske Abstract, & Fig. 4). 

"(b) using the first and second lists of rated keywords to determine 
whether the first document forms part of the second document using a first 
computed percentage indicating what percentage of keyword ratings in the first 
list also exist in the second list" as the clustering process begins when the weighted 
keyword lists of two or more documents are compared (step 601). The host device 
calculates a value, called "shared word weight," that correlates the two documents. The 
shared word weight value indicates the extent to which two or more documents are 
related based on their keywords. A higher shared word weight indicates that the 
documents are more likely to be related (Simske Paragraph 0048). 

"(c) computing a second percentage indicating what percentage of 
keyword ratings along with a set of their neighboring keyword ratings in the first 
list also exist in the second list when the first computed percentage indicates that 
the first document is included in the second document" as another possible way of 
weighting the relevancy metrics is to multiply the mean shared weight of extended 
words shared by two selected text units, e.g., sentences, by the frequency metric of the 
shared extended words, i.e., the mean ratio of the extended word occurrences in the 
two documents compared to their occurrences in the larger corpus (Simske Paragraph 
0064). 

"(d) using the first computed percentage to specify the measure of 
similarity when the second computed percentage is greater than the first 
computed percentage" as clustering documents with common titles, using weighted 
keywords to determine similarities between documents, etc., a preferred method uses a 
threshold shared word weight and a maximum, mean, or minimum shared word weight 
as explained above (Simske Paragraph 0055). 

Claims 11 and 20 are essentially the same as claim 1 except they set forth the 
claimed invention as a system and an article of manufacture and are rejected for the 
same reasons as applied hereinabove. 

With respect to claim 2, Simske teaches "the method according to claim 1, 
wherein the second percentage at (c) is computed by giving weight only to those 
keywords and their set of neighboring keywords in the first list that match in the 
second list and a threshold percentage of the keywords in their set of 
neighboring keywords" as shown in Table 5, the documents share two keywords, 
"Hockey" and "Skating." The shared word weight value of the keywords may be chosen 
in a variety of ways, e.g., maximum, mean, and minimum (Simske Paragraph 0050). 

Claim 12 is essentially the same as claim 2 except it sets forth the claimed 
invention as a system and is rejected for the same reasons as applied hereinabove. 

With respect to claim 4, Simske teaches "the method according to claim 2, 
wherein the threshold percentage is reduced when the first list of rated keywords 
is identified using OCR" as the documents included in each cluster may be adjusted 
by changing the threshold of the required shared word weight for clustering (Simske 
Paragraph 0058). If any documents being considered are paper-based, tools such as a 
zoning analysis engine in combination with an optical character recognition (OCR) 
engine may be used to convert the paper-based document to an electronic document 
(Simske Paragraph 0016). 

Claim 14 is essentially the same as claim 4 except it sets forth the claimed 
invention as a system and is rejected for the same reasons as applied hereinabove. 

With respect to claim 5, Simske teaches "the method according to 
claim 1, further comprising (e) if the first computed percentage does not indicate 
that the first document is included in the second document, computing a third 
percentage using the Jaccard distance measure" as the shared word weight value 
indicates the extent to which two or more documents are related based on their 
keywords. A higher shared word weight indicates that the documents are more likely to 
be related (Simske Paragraph 0048). Examiner interprets that when a document is 
related to or included in a second document, the third percentage need not be calculated. 
Alternatively, if inclusion is not indicated, examiner interprets the above limitation as the 
relevance weight for A being calculated, as shown, by summing (step 704) the weight of B 
divided by the distance of B (as measured in characters) from A (step 703), the weight 
of C divided by the distance of C from A (step 703), the weight of D divided by the 
distance of D from A (step 703), then multiplying that sum by the weight of A (step 705). 
The summation of keyword weights divided by their respective distances to a particular 
occurrence can be called a "distance metric" (step 704) (Simske Paragraph 0062). 
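
The relevance weight calculation quoted above from Simske Paragraph 0062 can be illustrated with a minimal sketch. The function names, keyword weights, and character distances below are hypothetical and chosen only for illustration; they are not taken from Simske or from the application. The conventional set-based Jaccard coefficient is shown alongside for reference only, as an assumption about what the claim term denotes; the claims do not define it here.

```python
from typing import Iterable, Set, Tuple


def relevance_weight(weight_a: float,
                     neighbors: Iterable[Tuple[float, float]]) -> float:
    """Weight of A multiplied by the sum of (neighbor weight / distance from A).

    Each neighbor is a (weight, distance_in_characters) pair and every
    distance must be positive. The inner sum is the "distance metric".
    """
    distance_metric = sum(weight / distance for weight, distance in neighbors)
    return weight_a * distance_metric


def jaccard_similarity(first: Set[str], second: Set[str]) -> float:
    """Conventional Jaccard coefficient: |intersection| / |union|."""
    union = first | second
    if not union:
        return 0.0  # assumption: two empty sets are treated as dissimilar
    return len(first & second) / len(union)


# Hypothetical weights and character distances for keywords B, C and D
# measured from one occurrence of keyword A.
neighbors_of_a = [(0.5, 10.0), (0.2, 20.0), (0.3, 30.0)]
score = relevance_weight(0.4, neighbors_of_a)
```

In this sketch the distance metric depends on where the keywords occur in the text, whereas the Jaccard coefficient depends only on which keywords the two sets have in common.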

Claim 15 is essentially the same as claim 5 except it sets forth the claimed 
invention as a system and is rejected for the same reasons as applied hereinabove. 

With respect to claim 10, Simske teaches, "the method according to claim 1, 
wherein the first document is a portion of the second document" as a method and 
system for organizing electronic documents by generating a list of weighted keywords, 
clustering documents sharing one or more keywords, and linking documents within a 
cluster by using similar keywords, sentences, paragraphs, etc., as links. The 
embodiments provide customizable user control of keyword quantities, cluster 
selectivity, and link specificity, i.e., links may connect similar paragraphs, sentences, 
individual words, etc. (Simske Paragraph 0015). 

Claim Rejections - 35 USC § 103 

7. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

This application currently names joint inventors. In considering patentability of 
the claims under 35 U.S.C. 103(a), the examiner presumes that the subject matter of 
the various claims was commonly owned at the time any inventions covered therein 
were made absent any evidence to the contrary. Applicant is advised of the obligation 
under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was 
not commonly owned at the time a later invention was made in order for the examiner to 
consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g) 
prior art under 35 U.S.C. 103(a). 

Claims 3, 6-7, 13, and 16-17 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Steven J. Simske (U.S. PG Pub No. 2004/0133560) as applied to 
claims 1-2, 4-5, 10-12, 14-15, and 20 above, in view of Rie Kubota (Kubota 
hereinafter) (U.S. Patent No. 6,041,323). 

With respect to claim 3, Simske teaches "the method according to claim 2, 
wherein the second percentage at (c) is computed by giving full weight to those 
keywords in the first list of rated keywords that cannot be accurately identified as 
having a complete set of neighboring keywords in the second set of keywords" 
as the experiment consists of varying the weighting, e.g., ranging the weight from 0.1 to 
10.0 using 0.1 steps, for a particular attribute (Simske Paragraph 0032). Examiner 
considers 10 as being full weight. 

Simske teaches the elements of claim 3 as noted above but does not explicitly 
disclose "keywords that cannot be accurately identified as having a complete set 
of neighboring keywords in the second set of keywords." 
However, Kubota discloses "keywords that cannot be accurately identified as 
having a complete set of neighboring keywords in the second set of keywords" 
as the fixed length chain is searched from the character chain file. In step 508, if it is 
determined that no fixed length chain is found, a message box is preferably displayed in 
step 526 for indicating that the search character string cannot be found, and the process 
ends (Kubota Col 26, Lines 44-48). Therefore the reference teaches that keywords are 
not found in the second set of keywords/document. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teaching of the cited references because Kubota's 
teachings would have allowed Simske to provide a search method, which requires less 
storage capacity and extracts a unique character string at a high speed (Kubota Col 2, 
Lines 51-53) and to provide a method for searching for a comparison document, which 
has character strings similar to a partial input character string existing in an input 
document (Kubota Col 2, Lines 3-6). 

Claim 13 is essentially the same as claim 3 except it sets forth the claimed 
invention as a system and is rejected for the same reasons as applied hereinabove. 

With respect to claim 6, Simske does not explicitly teach "the method 
according to claim 5, further comprising (f) if the third computed percentage 
indicates that the first document is a revision of the second document, 
computing a fourth percentage indicating what percentage of keyword ratings 
along with a set of their neighboring keyword ratings in the second list also exist 
in the first list." 

However, Kubota discloses "the method according to claim 5, further 
comprising (f) if the third computed percentage indicates that the first document 
is a revision of the second document, computing a fourth percentage indicating 
what percentage of keyword ratings along with a set of their neighboring keyword 
ratings in the second list also exist in the first list" as in the case of multiple 
documents, it may be a set of documents including the input document, or a set of 
documents extracted by search or the like (Kubota Col 3, Lines 63-66). Examiner 
interprets the input document as revision since the input document is the output to the 
search using keywords in the input document. 

Calculating the similarity factor of the comparison document from the first appearance 
frequency value taking the first weight value into account and the second appearance 
frequency value taking the second weight value into account (Kubota Col 6, Lines 22-25). 
Examiner interprets comparison document as second document. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teaching of the cited references because Kubota's 
teachings would have allowed Simske to provide a search method, which requires less 
storage capacity and extracts a unique character string at a high speed (Kubota Col 2, 
Lines 51-53) and to provide a method for searching for a comparison document, which 
has character strings similar to a partial input character string existing in an input 
document (Kubota Col 2, Lines 3-6). 

Claim 16 is essentially the same as claim 6 except it sets forth the claimed 
invention as a system and is rejected for the same reasons as applied hereinabove. 

With respect to claim 7, Simske teaches "the method according to claim 6, 
further comprising using the fourth computed percentage to specify the measure 
of similarity except when: (i) the fourth computed percentage is greater than the 
second computed percentage; (ii) the first list of rated keywords is identified 
using OCR; (iii) the fourth computed percentage is greater than fifty percent; and 
(iv) less than twenty percent of the keywords in the first list of keywords are in 
the second list of keywords" as if any documents being considered are paper-based, 
tools such as a zoning analysis engine in combination with an optical character 
recognition (OCR) engine may be used to convert the paper-based document to an 
electronic document (Simske Paragraph 0016). The keywords in the documents are 
identified using OCR in the reference. Therefore, there is no need to use the fourth 
computed percentage to specify the measure of similarity. 

Claim 17 is essentially the same as claim 7 except it sets forth the claimed 
invention as a system and is rejected for the same reasons as applied hereinabove. 

8. Claims 9 and 19 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Steven J. Simske (U.S. PG Pub No. 2004/0133560) as applied to claims 1-2, 4-5, 10-12, 
14-15, and 20 above, in view of Drissi et al. (Drissi hereinafter) (U.S. PG Pub No. 
2003/0149686). 

With respect to claim 9, Simske does not explicitly teach "the method 
according to claim 1, wherein the first list of rated keywords includes one or more 
keywords translated from a second language different from a first language that 
is identified as being a primary language of the first document." 

However, Drissi discloses "the method according to claim 1, wherein the 
first list of rated keywords includes one or more keywords translated from a 
second language different from a first language that is identified as being a 
primary language of the first document" as an inverted index 214 is created from the 
translated keywords. The translation of keywords is preferably accomplished using a 
keyword dictionary 220 which included words in English associated with the 
corresponding keywords in the national language (and vice versa) to form a synonym 
listing which effectively translates a keyword in one language into the corresponding 
term in another language and vice versa (Drissi Paragraph 0024). Examiner interprets 
the national language as primary language. 
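
The arrangement described in the passage quoted from Drissi Paragraph 0024, a keyword dictionary used to translate keywords before an inverted index is built, can be sketched as follows. This is an illustrative assumption about how such a structure might look, not Drissi's implementation; the dictionary entries, document identifiers, and function names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical keyword dictionary: national-language term -> English term.
KEYWORD_DICTIONARY: Dict[str, str] = {"patinage": "skating", "glace": "ice"}


def translate_keywords(keywords: List[str],
                       dictionary: Dict[str, str]) -> List[str]:
    """Replace each keyword that has a dictionary entry; keep the rest as-is."""
    return [dictionary.get(keyword, keyword) for keyword in keywords]


def build_inverted_index(documents: Dict[str, List[str]],
                         dictionary: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each translated keyword to the set of documents that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, keywords in documents.items():
        for keyword in translate_keywords(keywords, dictionary):
            index[keyword].add(doc_id)
    return dict(index)


# Hypothetical documents, one per language.
documents = {
    "doc_fr": ["patinage", "glace"],
    "doc_en": ["skating", "hockey"],
}
index = build_inverted_index(documents, KEYWORD_DICTIONARY)
```

After translation, a single entry such as "skating" points to documents written in either language, which is the effect the quoted passage attributes to the synonym listing.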

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teaching of the cited references because Drissi's 
teachings would have allowed Simske to provide a translation process to allow searching 
of the documents in different languages (Drissi Paragraph 0012). 

Claim 19 is essentially the same as claim 9 except it sets forth the claimed 
invention as a system and is rejected for the same reasons as applied hereinabove. 

9. Claims 8 and 18 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Steven J. Simske (U.S. PG Pub No. 2004/0133560). 

With respect to claim 8, Simske teaches "the method according to claim 1, 
wherein the first computed percentage indicates that the first document is 
included in the second document when the percentage defined by ratio of 
Sum1/Sum2 is greater than approximately ninety percent, where" as for example, if 
a threshold shared word weight value of 0.7 is designated, and the two documents of 
Table 5 are being compared for possible clustering, using the maximum shared word 
weight value (1.0) will cluster the two documents, while using the mean shared word 
weight (0.5) or minimum shared word weight values (0.3) will not cluster the two 
documents (Simske Paragraph 0052). Examiner interprets the threshold value of 70 
percent as 90 percent. 

"D1 is the number of keywords in first list of keywords" as table 5 with 
keywords from document 1 and document 2 (Simske Paragraph 0049). 

"D2 is the number of keywords in the second list of keywords" as table 5 
with keywords from document 1 and document 2 (Simske Paragraph 0049). 

"Sum1 is the sum of the weights of keywords that appear in D1 that also 
appear in D2" as the sum of all weight values for "Hockey" and "Skating" is 
0.4+0.25+0.3+0.05=1.0 (Simske Paragraph 0052). Hockey and Skating appear in both 
D1 and D2. 

"Sum2 is the sum of the weights of keywords in D1" as the keywords are 
located, a sentence weight is calculated (502), for example, by adding together all the 
keyword weights (Simske Paragraph 0045). 

Simske teaches the elements of claim 8 as noted above but does not explicitly 
disclose "Sum1/Sum2." 

However, Simske teaches "Sum1/Sum2" as the mean shared word weight 
value is (1.0/2) = 0.5 (Simske Paragraph 0052). 
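
The figures quoted in this rejection from Simske Paragraph 0052 can be checked with a short sketch. Only the four weights, the divisor 2, the minimum value 0.3, and the threshold 0.7 come from the quoted text; what the divisor counts is not restated in this action, and the comparison rule (meets or exceeds the threshold) and all names are assumptions made for illustration.

```python
# Weights quoted for the shared keywords "Hockey" and "Skating".
shared_weights = [0.4, 0.25, 0.3, 0.05]

maximum_value = sum(shared_weights)  # quoted as 1.0
mean_value = maximum_value / 2       # quoted as (1.0/2) = 0.5
minimum_value = 0.3                  # quoted directly; derivation not restated here

THRESHOLD = 0.7  # threshold shared word weight value quoted from the reference


def would_cluster(shared_word_weight: float,
                  threshold: float = THRESHOLD) -> bool:
    """Assumption: documents cluster when the value meets or exceeds the threshold."""
    return shared_word_weight >= threshold


decisions = {
    "maximum": would_cluster(maximum_value),
    "mean": would_cluster(mean_value),
    "minimum": would_cluster(minimum_value),
}
```

Under these assumptions only the maximum value clears the 0.7 threshold, which matches the outcome described in the quoted passage.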

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited reference to find the ratio for 
two possible similar documents by dividing the sum of keywords from both documents 
by sum of keywords in one document. 
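
For reference, the Sum1/Sum2 ratio as defined in the quoted language of claim 8 can be sketched as below. The keyword lists and weights are hypothetical, each list is modeled as a mapping from keyword to weight, and the cutoff of 0.9 stands in for the claim's "approximately ninety percent"; none of these details come from the specification or from Simske.

```python
from typing import Dict


def inclusion_ratio(first: Dict[str, float], second: Dict[str, float]) -> float:
    """Sum1 / Sum2, following the quoted claim language.

    Sum1: sum of the weights of first-list keywords that also appear in the
    second list. Sum2: sum of the weights of all first-list keywords.
    """
    sum1 = sum(weight for keyword, weight in first.items() if keyword in second)
    sum2 = sum(first.values())
    if sum2 == 0:
        return 0.0  # assumption: an empty or zero-weight list indicates no inclusion
    return sum1 / sum2


def indicates_inclusion(first: Dict[str, float], second: Dict[str, float],
                        cutoff: float = 0.9) -> bool:
    """True when the ratio exceeds the (approximate) ninety percent cutoff."""
    return inclusion_ratio(first, second) > cutoff


# Hypothetical rated keyword lists for a first and a second document.
first_list = {"hockey": 0.5, "skating": 0.45, "arena": 0.05}
second_list = {"hockey": 0.3, "skating": 0.2, "puck": 0.1, "goalie": 0.1}
ratio = inclusion_ratio(first_list, second_list)
```

The ratio is not symmetric: swapping the two lists changes both Sum1 and Sum2, so the first list may be indicated as included in the second without the reverse holding.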

Claim 18 is essentially the same as claim 8 except it sets forth the claimed 
invention as a system and is rejected for the same reasons as applied hereinabove. 

Conclusion 

10. The prior art made of record and not relied upon, which is considered pertinent to 
applicant's disclosure, is listed on the 892 form. 

Contact Information 

11. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Usmaan Saeed whose telephone number is (571)272-4046. 
The examiner can normally be reached on M-F 8-5. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Hosain Alam, can be reached on (571)272-3978. The fax phone number for 
the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 